(54) Push-pull services for the Internet

(57) In an arrangement that employs a push-pull 
paradigm, information that is to be communicated to cli- 
ents is broadcast, or multicast, to cache servers, where- 
in the information is cached in preparation for its being 
pulled by clients. By pushing information to points close 
to the clients, both source overload and network over- 
load are avoided. The pushed information in the dis- 
closed approach is transmitted over Internet links, or 
over other communication channels, such as cable and 
radio systems. In operation, clients subscribe to specific 
services of the provider, an association is established 
between the subscribing client and a cache server, and 
the server informs the network that it should be included 
among the destinations to which information from the 
provider is transmitted. When information is subsequently
transmitted by the provider and received by the cache
servers, it is stored in the cache server in preparation 
for its being pulled by the clients, as desired and when 
desired. 
Description

Background of the Invention 

[0001] This invention relates to Internet services and, 
more particularly, to "push" type communication servic- 
es through the Internet where transmission of informa- 
tion is initiated by a source. 

[0002] A recent survey has shown that 18% of Internet
traffic is due to push services provided by companies 
like PointCast, where information is collected from di- 
verse sites on the Internet and is made available to cli- 
ents in distinct, categorized, channels. Clients pre-sub- 
scribe to those channels and, to a client connected to 
such a site, the information appears to be available to 
the client's browser by simply selecting the categories, 
or "channels", without any browsing of the Internet. Hence,
the term "push" is used, as contrasted to the term "pull",
where the client pulls information from various different 
source sites after browsing the Internet and selecting 
the sources one by one. To the client, this arrangement 
appears much like cable TV, where premium channels 
are subscribed to and, once subscribed to, are always 
available to the client. 

[0003] Although the familiar cable TV channel sub- 
scriptions paradigm corresponds to a multicast arrange- 
ment, the present day push service providers do not ac- 
tually multi-cast any information. Rather, they browse 
the Internet, cache all of the information that they will 
offer in the various channels (and perhaps create 
some), and wait for clients to connect to their respective 
servers. Those clients that do not encounter server 
overload from too many connected clients, and thus 
succeed in connecting to the server, are sent informa- 
tion, as requested, in the normal TCP/IP manner. In re- 
ality, then, the offered service of present day push serv- 
ice providers is really a "pull" service from a single 
source (that is, under control of the push service provid- 
er), as contrasted to a "pull" service from individual 
browsed sources. 

[0004] One problem with the push service approach 
is that it does not scale very well. Aside from the server 
overload problem mentioned above, there is also a potential
network overload problem: when many clients
want to receive information, they all have to gain access 
to the site, open up separate TCP/IP connections to the 
server and retrieve information packets. When those 
packets relate to the same information that is requested 
by many clients, duplicate packets are pulled through 
the network. This results in unnecessary traffic through 
the network. 

[0005] The Internet does provide for multi-casting capabilities
through Class D Group addressing, whereby users
dynamically join or leave a group by means of the IGMP 
protocol. A multicast tree is set up, and intermediate 
routers replicate the transmitted information along the 
branches of the tree. This relieves the transmission burden
on the site that provides the information, because
the information gets replicated in the network. However,
the resulting overall traffic on the network is not much
lower, and an overhead must be suffered in the processes
that dynamically set up multicast trees, join existing
trees, leave existing trees, and dismantle existing
trees. Moreover, the pushed information must be received
when it is sent.

[0006] In a separate art, push technology is used extensively
in satellite, cable, and conventional radio applications,
where information is broadcast to all clients
who are passive listeners. Adopting a true push approach
would clearly overcome the server overload
problem and the routing overhead problem. However, it
would introduce other problems. For example, requiring
the browsers of client computers to accept information
whenever some transmitting point chooses to push information
would require major modifications to the
browsers that are currently available. Also, requiring client
computers to accept and store large amounts of data
that the user may, ultimately, choose not to look at places
an undue burden on the client computers. Further,
some corporate environments use "firewalls" which
strictly control what is allowed to come into the corporate
network, and no multicast or broadcast traffic is allowed.

Summary of the Invention 

[0007] Problems associated with the pull paradigm
and with the push paradigm are overcome by employing
a caching server architecture with a push-pull paradigm.
Information that is to be communicated to clients is
broadcast, or multicast, and pushed to cache servers,
wherein the information is cached in preparation for its
being pulled by clients, when desired. By pushing information
to points close to the clients, both source overload
and network overload are avoided. The pushed information
in the disclosed approach is transmitted over
the Internet, or over other communication channels,
such as cable and radio systems. Actually, cable and
radio are channels that are naturally suited to broadcasting
and, therefore, are particularly advantageous. The
process of a provider pushing information to the cache
servers is controlled, in part, by clients who choose to
subscribe to specific services of the provider. As part of
the subscription process, an association is established
between the subscribing client and a cache server, and
the cache server informs the network that it should be
included among the destinations to which information
from the provider is transmitted. When information is
subsequently transmitted by the provider and received
by the cache server, it is stored in the cache server in
preparation for its being pulled by the clients, as desired
and when desired.

[0008] In applications where firewalls are set between
a corporate network and the rest of the Internet, a relay
agent is installed at the firewall gateway to serve as the
cache server. In applications where client PCs connect
at will to the Internet through an Internet service provider
(ISP), the provider specifies the association between
the clients and particular cache servers. Typically, the 
cache server that is associated with a client is closer to 
the client (from the standpoint of number of Internet 
nodes that need to be traversed) than the servers that 
provide the information. 

Brief Description of the Drawings 

[0009] FIG. 1 presents an arrangement for imple- 
menting the push-pull service of this invention; and 
FIG. 2 presents the FIG. 1 arrangement with wireless 
and cable means for broadcasting information to the 
cache servers. 

Detailed Description 

[0010] In the context of this disclosure, a push service 
provider is a provider that offers information to subscrib- 
er clients in a manner such that to a client the information 
appears to be readily present in the client's computer. 
This includes the type of providers described above in 
the "Background of the Invention" section. 
[0011] From the client's perspective with respect to 
minimizing latency, the ideal situation is for the push 
service provider to a priori install on the client's compu- 
ter all of the information that the client might wish to view. 
In this way, the information is as readily available as is 
possible. That would be a true push service. For the rea- 
sons discussed above, however, this is not a viable ap- 
proach. Therefore, in accordance with the approach dis- 
closed herein, instead of pushing information to a cli- 
ent's computer, the provider pushes information to a 
nearby cache server that is associated with the client's 
computer. Typically, such a cache server is located in 
the vicinity of the client which, in the context of this dis- 
closure, means a location to which the client can con- 
nect with the least burden on the Internet network. A "vi- 
cinity" would correspond to a small number of Internet 
nodes between the client and the cache server. Some 
practitioners might also account for the bandwidth that 
may be available. Thus, a cache server that is two nodes 
away with very high bandwidth links might be selected 
over a cache server that is one node away but which 
has a relatively narrow band link. 
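
The trade-off between node count and bandwidth described above can be illustrated with a minimal sketch. The cost formula, the payload size, the per-hop delay, and the names `Candidate`, `estimated_cost`, and `pick_cache_server` are assumptions made for illustration only; the disclosure does not prescribe any particular metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    hops: int              # Internet nodes between the client and the cache server
    bandwidth_mbps: float  # narrowest link on the path (assumed to be known)


def estimated_cost(c, payload_mbit=8.0, per_hop_delay_s=0.05):
    """Hypothetical cost: per-hop delay plus the time to move a typical payload."""
    return c.hops * per_hop_delay_s + payload_mbit / c.bandwidth_mbps


def pick_cache_server(candidates):
    """Select the candidate with the lowest estimated cost."""
    return min(candidates, key=estimated_cost)


# One node away over a narrow link versus two nodes away over a wide link.
near_but_narrow = Candidate("cache-A", hops=1, bandwidth_mbps=0.056)
far_but_wide = Candidate("cache-B", hops=2, bandwidth_mbps=45.0)
chosen = pick_cache_server([near_but_narrow, far_but_wide])
```

Under these assumed numbers the two-node, high-bandwidth server is preferred; with equal bandwidth the metric falls back to the smaller node count.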
[0012] Although the approach disclosed herein is 
more of a "push" service than the one currently availa- 
ble, it is still not a truly "push" service. For sake of ac- 
curacy, the following refers to the disclosed service as 
"push-pull" service. The provider of such a service is, 
nevertheless, referred to as a "push service provider", 
as stated above, because that is the impression that 
such providers wish to leave with their clients. 
[0013] In accordance with the disclosed approach, 
the push service provider multicasts or broadcasts to 
cache servers all of the files that the cache servers need 
in order to fulfill the subscription obligations of the associated
clients. When clients desire information, they pull
it from the cache servers, instead of from the host that
belongs to the push service provider, using conventional
browsers. The files pushed to the cache servers might
be static image files, video clip files, voice segment files,
etc. In connection with files created by the push service
provider, every time an updated file is generated, it is
transmitted to the cache servers where it replaces the
old file. When a push service provider discards a file, a
message is sent to the cache servers to discard the corresponding
file. In connection with files created by others
and adopted, so to speak, by the push service provider,
the latter checks the source of the files at some
selected regularity and updates the cache servers appropriately.
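
The update and discard behavior described in this paragraph can be sketched as follows. This is an in-memory illustration only: the class and method names (`CacheServer`, `PushServiceProvider`, `publish`, `discard`, `pull`) are hypothetical, and direct method calls stand in for the multicast or broadcast transmission.

```python
class CacheServer:
    """Holds the files pushed by the provider until clients pull them."""

    def __init__(self):
        self._files = {}

    def on_update(self, name, content):
        # An updated file replaces the old file of the same name.
        self._files[name] = content

    def on_discard(self, name):
        # A discard message removes the corresponding file, if present.
        self._files.pop(name, None)

    def pull(self, name):
        # A client pulls from the cache server, not from the provider's host.
        return self._files.get(name)


class PushServiceProvider:
    """Stand-in for the host; it pushes to every subscribed cache server."""

    def __init__(self, cache_servers):
        self._cache_servers = list(cache_servers)

    def publish(self, name, content):
        for server in self._cache_servers:
            server.on_update(name, content)

    def discard(self, name):
        for server in self._cache_servers:
            server.on_discard(name)
```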

[0014] FIG. 1 presents a drawing of the salient elements
of the Internet which will assist in understanding
the various aspects of this invention. Host 10 is a computer
that provides a push service. It is connected to the
Internet via router 101. Router 101, routers 102-107,
and interconnecting links 201-213 form the Internet.
Cache servers 301, 302, and 303 are connected to routers
105, 107, and 106, respectively, and clients are connected
to some of the routers. Specifically, client 401 is
connected to router 102, client 402 is connected to router
105, clients 403 and 404 are connected to router 107,
and clients 405-407 are connected to router 106.
[0015] FIG. 1 also shows a corporate network that
comprises routers 109, 110, and 111 that are interconnected
via links 214, 215, and 216, and clients 410, 411,
and 412 coupled to router 110. The corporate network
is connected to the Internet through a gateway "firewall"
computer 500. Computer 500 includes a coupled cache
server 501 that, effectively, is situated outside the "firewall"
(i.e., on the Internet side and not on the corporate
network side).

[0016] For the push service of this disclosure, the operation
of the FIG. 1 network can be divided into a set-up
phase, and a steady-state phase. During the set-up
phase, the network is conditioned to bring information
that is transmitted by host 10 to the various cache servers
that seek to store the information. Illustratively in
FIG. 1, the cache servers that need to receive information
are cache servers 301, 302, 303, and 501. During
the steady-state phase, information that is transmitted
by host 10 is stored in cache servers 301, 302, 303, and
501, and that information is pulled by any of the subscriber
clients, at will, from their designated cache servers.
The pulling of information by corporate network clients,
such as client 410, is accomplished in accordance
with whatever protocol the guardians of the corporate
network specify.

Set-Up 

[0017] The set-up phase can also be broken into two
portions. The first is assigning cache servers to serve
specific clients (not necessarily a static assignment),
and the second is conditioning the network to ensure that
appropriate cache servers receive the needed information.
Illustratively, FIG. 1 shows a portion of the Internet
network where an Internet Service Provider (ISP), e.g.,
AT&T, or America on Line, owns routers 102, 105, 106
and 107, and where the shown clients (other than the
corporate network clients) are served by that ISP. That
is, these clients have an agreement with the ISP whereby
the clients are provided access to the Internet in exchange
for a monthly fee. Illustratively, the ISP has chosen
to connect a cache server to three of the four routers
(excluding router 102), and through these cache servers
the ISP provides its clients with the enhanced push-pull
service disclosed herein (as well as other caching services).
Presumably, the ISP has made arrangements with
either its clients or with the provider that owns host 10
for some extra compensation for use of its cache servers.

EP 0 967 559 A1

[0018] When a client, for example client 401, wishes
to subscribe to a push-pull service offered by the pro- 
vider that owns host 10, the client informs its ISP of this 
desire and causes the ISP to assign the client to a cache 
server. This is done, for example, by installing one or 
more entries in the DNS (Domain Name System) that is 
assigned to the client, which resolve, for this client, the 
Internet address of host 10 to that of different cache 
servers in the vicinity of the client. That address might 
even be the address of a cache server that is co-located 
with the node of the ISP to which the client dials in. In 
such a case, the cache server is at the ultimate periphery
of the Internet network vis-a-vis the client. In the illustrative
example of FIG. 1, the ISP might select cache
server 301 as the cache server for client 401. It should 
be noted that such an assignment need not be perma- 
nent, or static. For various reasons, such as load bal- 
ancing, the association of a client to a cache server can 
be changed (e.g., by simply modifying the appropriate 
entry in the client's DNS). Obviously, given a choice of 
two equally loaded cache servers, the server that is ad- 
vantageously selected is the one that least loads the In- 
ternet network. 
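
The per-client name resolution described above can be sketched as a lookup table that overrides the address of the provider's host with that of the assigned cache server. This is an illustration of the idea, not an actual DNS server: the class `ClientAwareResolver`, its methods, the host name, and the addresses (taken from documentation address ranges) are all hypothetical.

```python
class ClientAwareResolver:
    """Resolves a host name differently depending on which client asks."""

    def __init__(self, origin_addresses):
        # Normal name-to-address entries, e.g. the real address of host 10.
        self._origin = dict(origin_addresses)
        # (client, host name) -> address of the assigned cache server.
        self._overrides = {}

    def assign(self, client_id, hostname, cache_address):
        # Installing or modifying an entry changes the association; it is not static.
        self._overrides[(client_id, hostname)] = cache_address

    def resolve(self, client_id, hostname):
        return self._overrides.get((client_id, hostname), self._origin[hostname])


resolver = ClientAwareResolver({"host10.example": "192.0.2.10"})
resolver.assign("client401", "host10.example", "198.51.100.31")  # cache server 301
```

Reassigning a client, for instance for load balancing, is then a second call to `assign` with a different cache server address.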

[0019] Having assigned the client to a cache server, 
the next step is to condition the Internet so that the appropriate
cache servers, such as server 301, receive
the host 10 information that their clients subscribe
to. Such conditioning may be effected by a standard IP 
multicasting protocol, such as the Internet Group Man- 
agement Protocol (IGMP). In accordance with this pro- 
tocol, host 10 sends a special packet that floods the In- 
ternet and specifies a group ID. Each router receives 
this packet from some of the links that are connected to 
the router, and forwards this packet to all of the links that 
are connected to the router from which this packet did 
not arrive. With respect to that particular host, the former 
links are the incoming links of the router, and the latter 
links are the outgoing links of the router. After the flood- 
ing message is sent, all routers respond. A router that 
a) has no cache server that wants transmissions to the 
special packet's group, and that b) has all of its outgoing
links provide a pruning message response, outputs a
pruning message to all of its incoming links. A router that
does not meet both criteria outputs a pruning message
to all but one of its incoming links. Links that pass a pruning
message are pruned from the tree. This results in a
tree that defines the branches (links) through which
packets transmitted by host 10 flow, where each cache
server, as a receiver, or a leaf of the tree, has a path to
host 10 through one or more routers. Creation of the
routing tree can occur at a regular rate, such as
every 30 seconds.
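
The flood-and-prune outcome described above can be approximated with a short sketch. It is a simplification, not an implementation of any real protocol's message exchange: a breadth-first flood stands in for the special packet, the first link on which the packet reaches a router is treated as the one incoming link that is kept, and pruning is computed centrally instead of through pruning messages. The function name and the link layout in the example are assumptions; the endpoints of links 201-213 are not given in the text.

```python
from collections import deque


def build_multicast_tree(links, source, member_routers):
    """Return the set of (upstream, downstream) links left after pruning."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Flood: the first arrival of the packet fixes each router's kept incoming link.
    parent = {source: None}
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    # Prune: keep a router only if it serves a cache server or has a kept child.
    keep = set(member_routers) & set(parent)
    for node in reversed(order):  # children are visited before their parents
        if node in keep and parent[node] is not None:
            keep.add(parent[node])
    return {(parent[n], n) for n in keep if parent[n] is not None}


# Hypothetical topology; routers 105, 106 and 107 have cache servers attached.
example_links = [(101, 102), (101, 103), (102, 105), (103, 104),
                 (104, 106), (105, 107), (103, 108)]
tree = build_multicast_tree(example_links, 101, {105, 106, 107})
```

In this example the link toward router 108 is pruned, because neither that router nor anything downstream of it serves a subscribed cache server.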

[0020] The IGMP protocol also permits a dynamic 
joining or leaving of the tree. A new cache server is added
by sending a grafting message to host 10, with the
path taken by the grafting message being established
as part of the tree. Leaving a tree is done in a similar way. 
[0021] The above-described approach to multicasting 
is merely illustrative, of course, and other protocols can 
be used. 


Steady-State 

[0022] The steady-state operation is, in a sense, 
straightforward. Host 10 multicasts information at whatever
rate it desires and, once the transmission tree is
set up, the transmitted packets arrive at the cache serv- 
ers, wherein they are stored. Thereafter, the stored 
packets may be pulled by the clients, as desired and 
when desired. 

[0023] Most file transmission protocols on the Internet
are of the "best effort" variety. For the arrangement disclosed
herein, it would be advantageous to employ a
protocol that provides a greater assurance of successful
file transmissions. This may be accomplished, for example,
with an "application-layer" protocol (herein
called EUReCa) which guarantees delivery of objects
(such as files). This protocol ensures that objects sent
by a source machine (a sender) to any number of destination
machines (receivers) actually arrive at the intended
receivers even when the receivers are temporarily
unavailable, for example due to failure or due to
network partition. EUReCa can be either sender-driven
(EUReCa-S) or receiver-driven (EUReCa-R).
[0024] In EUReCa-S, the sender explicitly keeps track
of the status of every receiver through an Active Receiver
List (ARL). That is, the sender knows the identity of
the receivers (cache servers) that are supposed to receive
a transmitted object, and waits for each receiver
to acknowledge every received object before proceeding
with transmission of a next object. As an aside, a
receiver can send an acknowledgment for every object
it receives, can send a cumulative acknowledgment for
a set of objects, or can even send an acknowledgment
for a "portion" of an object. The last type of acknowledgment
is useful when the object is a very large file (say,
a video movie of several Gigabytes). When the sender
does not receive an acknowledgment from a receiver
within a pre-determined time, it flags the receiver's entry
in the ARL as unavailable, and keeps track of objects
that should have been received, but were not. This may
be done, for example, by noting the time when the receiver
became unavailable.

[0025] Recovery is effected, illustratively, by polling 
the unavailable receivers at regular intervals. Once a re- 
ceiver becomes active and affirmatively responds to the 
polling signal, the sender, such as host 10, retransmits 
all the files that have been missed by the now-available 
receiver. The receivers that have not been unavailable 
receive a second copy of the objects, but that is not detrimental.
To minimize the down time of a receiver that
has been unavailable and then becomes available,
the EUReCa protocol permits such a receiver to
send a message that informs host 10 that it is now available
to receive objects.
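
The sender-driven bookkeeping and recovery described in the two preceding paragraphs can be sketched as follows. The text names only the Active Receiver List; everything else here (the class `SenderDrivenTracker`, its methods, the use of plain numbers for time, and per-object acknowledgments only) is an assumption made for illustration and is not the EUReCa specification.

```python
class SenderDrivenTracker:
    """Minimal Active Receiver List (ARL) bookkeeping for a sender."""

    def __init__(self, receivers, ack_timeout):
        self.ack_timeout = ack_timeout
        # ARL: receiver -> None if available, else the time it became unavailable.
        self.unavailable_since = {r: None for r in receivers}
        self.sent_log = []  # (send time, object id), in transmission order
        self.pending = {}   # object id -> (send time, receivers still owing an ack)

    def send(self, object_id, now):
        self.sent_log.append((now, object_id))
        waiting = {r for r, since in self.unavailable_since.items() if since is None}
        self.pending[object_id] = (now, waiting)

    def acknowledge(self, receiver, object_id):
        self.pending[object_id][1].discard(receiver)

    def expire(self, now):
        """Flag receivers whose acknowledgment is overdue as unavailable."""
        for sent_at, waiting in self.pending.values():
            if now - sent_at >= self.ack_timeout:
                for receiver in waiting:
                    if self.unavailable_since[receiver] is None:
                        self.unavailable_since[receiver] = sent_at
                waiting.clear()

    def recover(self, receiver):
        """Receiver is reachable again: return the objects it missed, in order."""
        since = self.unavailable_since[receiver]
        self.unavailable_since[receiver] = None
        if since is None:
            return []
        return [obj for sent_at, obj in self.sent_log if sent_at >= since]
```

Recording only the time at which a receiver became unavailable follows the example given in the text; the list of missed objects is then derived from the send log when the receiver answers a poll or announces itself.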

[0026] In EUReCa-R, the sender does not explicitly
keep track of the receivers' status. Rather, it transmits
objects with a time stamp and a sequence number, and
leaves the responsibility of reliable delivery to the receiver.
It also sends a "heartbeat" message on a periodic
basis. A receiver detects that something is wrong
when it misses more than a predetermined number of
the "heartbeat" messages, when it detects a missing object
because the sequence is off, when it does not receive
an object completely, or when it becomes available
after being unavailable for some time. When the receiver
misses an object, it requests a retransmission of
the missed object based on the missing object's sequence
number. When the receiver has been unavailable
for a while and then becomes available, it provides
the sender with the last timestamp and the size of a file
it received from the sender (in case it only partially received
an object). Based on this timestamp, the sender
retransmits the object(s) and/or portions of an object
that need(s) to be retransmitted.
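
Two of the receiver-side checks described above, sequence gaps and missed heartbeats, can be sketched as follows. The class `ReceiverDrivenMonitor`, its method names, and the way missed heartbeats are counted from elapsed time are assumptions made for illustration; partial-object detection and the actual retransmission request are left out.

```python
class ReceiverDrivenMonitor:
    """Receiver-side detection of missing objects and missed heartbeats."""

    def __init__(self, heartbeat_interval, max_missed):
        self.heartbeat_interval = heartbeat_interval
        self.max_missed = max_missed
        self.last_heartbeat = 0.0
        self.last_sequence = None
        self.last_timestamp = None

    def on_object(self, sequence, timestamp):
        """Record an object; return the sequence numbers to request again."""
        if self.last_sequence is None:
            missing = []
        else:
            missing = list(range(self.last_sequence + 1, sequence))
        if self.last_sequence is None or sequence > self.last_sequence:
            self.last_sequence = sequence
            self.last_timestamp = timestamp  # reported to the sender after an outage
        return missing

    def on_heartbeat(self, now):
        self.last_heartbeat = now

    def sender_seems_unreachable(self, now):
        """True once more than max_missed heartbeat intervals have elapsed."""
        missed = int((now - self.last_heartbeat) // self.heartbeat_interval)
        return missed > self.max_missed
```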
[0027] The above disclosure addresses a push-pull 
service architecture that is based on the existing Internet 
infrastructure. We realized, however, that other mecha- 
nisms, which are well known but not used in the Internet, 
offer a more efficient approach for distributing push- 
service information. In particular, we realized that wire- 
less technology, such as satellite communication, cellu- 
lar communication, etc., as well as cable technology are 
both suited extremely well for distribution of push-pull 
service information. FIG. 2, therefore, shows the FIG. 1
network (with links 201-213 not shown for sake of clarity)
that further comprises a wireless transceiver unit 600, 
and corresponding units within each of the routers that 
terminate with an antenna. 

[0028] Unit 600 may be a satellite that broadcasts to
all of the routers, while the units in each of the routers 
have a receiver and a transmitter to up-link to the satel- 
lite. Of course, the depiction of FIG. 2 is merely illustra- 
tive, and other means may also be employed. For ex- 
ample, broadcast can be effected with a network of cel- 
lular stations instead of a satellite. Also, the broadcast 
can be directly to the cache servers, rather than to the
routers. In operation, host 10 transmits its information
to unit 600 via an uplink channel, and unit 600 broad- 
casts that information to all of the routers, or to all of the 
cache servers, as the case may be. 

[0029] FIG. 2 also includes a cable system, which
may alternatively be used. The cable system shown is 
a "daisy chain" system, which begins at head station 
700, visits each of the routers, and returns to the head 
station. Broadcasting from host 10 is effected by
host 10 sending information on an "uplink" channel of the
cable to head station 700, and head station 700 broadcasting
the information on a downlink channel, sending the
broadcast signal around the loop. Cable 710 can be a
coax cable that sends electrical signals or it can be a
fiber-optical cable.

[0030] The above presents the principles of this invention
and it should be appreciated that various modifications
are possible that are encompassed by the disclosed
principles. For example, the above discloses the
notion that transmission through the Internet network
links is carried out using a multi-cast protocol. Actually,
it could encompass various hybrid arrangements. For
instance, an ISP that owns a number of cache
servers may designate one of its cache servers as the
interface to various push service providers (such as host
10), and assume responsibility of spreading, or dispersing,
the received information among its cache servers.
Such spreading could be by simply multicasting
throughout a fixed tree that connects its cache servers,
but other approaches are also possible.



Claims 

1. In a network comprising routers and links that interconnect
said routers, as well as hosts, cache servers
and client computers that are coupled to said
routers, a method for providing information to a plurality
of client computers that subscribe to a push
service of one of said hosts serving as a push service
provider, comprising the steps of:

said push service provider communicating information
to selected ones of said cache servers
that are assigned to service said plurality of
client computers;

said selected ones of said cache servers storing
said information; and

when one of said plurality of client computers
requests some of said information, a cache
server to which said one of said plurality of client
computers is assigned providing the requested
information.

2. The method of claim 1 where said client computers
are assigned to cache servers that are most directly 
connectable to said client computers. 
3. The method of claim 1 where assignment of client
computers to cache servers is changeable. 

4. The method of claim 1 where said client computers 
are dynamically assigned to cache servers based 
on loads of said cache servers. 

5. The method of claim 1 further comprising a step of 
assigning said client computers to said cache serv- 
ers. 

6. The method of any of the preceding claims where 
said step of push service provider communicating 
information is carried out via said links, employing 
a multicasting protocol. 

7. The method of claim 6 where said step of push serv- 
ice provider communicating information is carried 
out following a step of establishing a multicasting 
transmission tree. 

8. The method of any of claims 1 to 5 where said step 
of push service provider communicating informa- 
tion is carried out via any combination of transmis- 
sion elements taken from a set comprising said 
links, wireless connection between said push serv- 
ice provider and said cache servers, and a cable 
connection that couples the push service provider 
to a cable head station, and couples the cable head 
station to said cache servers. 

9. The method of claim 16 where said step of push 
service provider communicating information is car- 
ried out via a broadcast medium that couples said 
push service provider to said selected ones of said 
cache servers, for example coaxial or optical cable 
that connects to said cache servers. 

10. The method of any of the preceding claims where 
said step of push service provider communicating 
information is carried out via an application level protocol
that provides object-level guaranteed delivery.

11. The method of any of claims 1 to 9 where said step 
of push service provider communicating informa- 
tion is carried out with a protocol for retransmission 
of objects that have not been successfully commu- 
nicated to a server. 

12. A method comprising the steps of: 

specifying for a client computer a cache server, 
establishing a request for a push service pro- 
vider to transmit information to said cache serv- 
er whenever said push service provider choos- 
es to update content of information that it offers, 
accepting and storing information transmitted 
by said push service provider, and
delivering said information from said cache
server to said client computer, upon request for 
said information from said client computer. 